"Collecting Highly Parallel Data for Paraphrase Evaluation": published presentations and documents on DocSlides.